

Institut für Technische Informatik Chair for Embedded Systems - Prof. Dr. J. Henkel

Karlsruhe Institute of Technology

#### Lecture in SS 2016 (summer semester)

# Reconfigurable and Adaptive Systems (RAS)

Marvin Damschen, Lars Bauer, Jörg Henkel



### Organisation

- Lecture time: Wed., 15:45 – 17:15, Bld. 50.34, HS -102
- Homepage: http://ces.itec.kit.edu/1317\_1320.php
  - You can find the slides from previous years on our old homepage: http://cesweb.itec.kit.edu/teaching/
- Slides Login: Login: "student", Passwd: "CES-Student"
- Contact: marvin.damschen@kit.edu, Haid-und-Neu-Str. 7, Bld. 07.21, Rm. B2–314.4 (B–Wing, 2<sup>nd</sup> Floor)

### CES @ Technologiefabrik (TFI)



M. Damschen, KIT, 2016

### Questions during the lecture



> Simply let me know / interrupt me


# RAS Exam

#### CS Diploma:

- Vertiefungsfach 8 (specialization subject 8): Entwurf eingebetteter Systeme und Rechnerarchitekturen

#### CS Master:

- Module: Rekonfigurierbare und Adaptive Systeme [IN4INRAS] (3 ECTS)
- Module: Eingebettete Systeme: Weiterführende Themen [IN4INESWTN] (10 ECTS)
- Module: Advanced Computer Architecture [IN4INACA] (10 ECTS)
- Other Study Courses (e.g. EE): ask individually

# Teaching @ CES, SS 2016

#### Lectures

- RAS
- Low Power Design

#### Labs

- Entwurf eingebetteter Systeme
- Entwurf von eingebetteten applikationsspezifischen Prozessoren
- Low Power Design and Embedded Systems

#### Seminars

- <u>Rekonfigurierbare Eingebettete Systeme</u>
- Dependability in Embedded Systems
- Stereo Video Processing
- Multicore for Multimedia Processors
- Low Power Design for Embedded Systems
- Internet of Things for Healthcare

#### More Info: <u>http://ces.itec.kit.edu/26.php</u>

#### Theses @ CES




Available Theses

Abbreviations: D - Diploma Thesis, M - Master Thesis, S - Student Work, B - Bachelor Thesis

| Topic | Type of work | Mentor |
|-------|--------------|--------|
| Analyse von Rekonfigurierbaren Mehrkernsystemen für Echtzeitkritische Anwendungen | D/M/B | Damschen, Marvin / Bauer, Lars |
| ECG processing for IoT-based healthcare devices | D/M | Samie, Farzad / Bauer, Lars |
| Context-Aware data compression for Internet-of-Things (IoT) | D/M | Samie, Farzad / Bauer, Lars |
| Approximate Image Processing in Android-based IoT devices | B/M | Castro-Godínez, Jorge / Shafique, Muhammad |
| Approximate Motion Tracking in Android-based IoT devices | B/M | Castro-Godínez, Jorge / Shafique, Muhammad |
| Automatic Approximate Accelerator Generation using High Level Synthesis Tools | B/M | Castro-Godínez, Jorge / Shafique, Muhammad |
| OpenCL/CUDA Programming for Reliability Analysis | D/M/B | van Santen, Victor / Amrouch, Hussam |
| Reliability of Electrical Circuits | D/M/B | van Santen, Victor / Amrouch, Hussam |
| Degradation Effects in Microprocessors | D/M/B | van Santen, Victor / Amrouch, Hussam |
| Design and implementation of hardware accelerators for reconfigurable processors | B | Kerekare, Srinivas Rao / Bauer, Lars |
| Area and Energy evaluation of a combined ASIC+FPGA implementation of a reconfigurable processor | B | Kerekare, Srinivas Rao / Damschen, Marvin / Bauer, Lars |
| Entwicklung von Spezialbefehlen zur Lösung von Shallow Water Equations | B | Damschen, Marvin / Bauer, Lars |
| Simulation eines Rekonfigurierbaren Systems für Zeitkritische Anwendungen | B | Damschen, Marvin / Bauer, Lars |
| Entwurf einer Hardware/Software Co-Simulationsplattform für Rekonfigurierbare Architekturen | S/B | Zhang, Hongyan / Bauer, Lars |
| Design einer grafischen Oberfläche in Qt für die Visualisierung eines Multiagentensystems | B | Wenzel, Volker |


### Theses @ CES

- http://ces.itec.kit.edu/69.php
- **Note:** Info on homepage is typically not up-to-date
  - If you are interested in a particular topic: better ask individually
- There are often SA/DA/BA/MA theses (student work, diploma, bachelor, master) or Hiwi jobs (student assistant positions) available in the scope of reconfigurable systems
- Main projects:
  - *i*-Core: invasive Core
  - Real-time: Analysis and design of predictable reconfigurable architectures
  - OTERA: Online Test Strategies for Reliable Reconfigurable Architectures
- Topics:
  - Algorithms for Runtime System, Operating System, ...
  - Static Program Analysis, Toolchain, Compiler, Synthesis, ...
  - Architecture, Hardware Prototype, Simulation Environment, ...

# **Beneficial Previous Knowledge**

- Rechnerstrukturen
  - Prerequisites
- Eingebettete Systeme
  - ES1: Optimierung und Synthese Eingebetteter Systeme
  - ES2: Entwurf und Architekturen für Eingebettete Systeme
  - The core topics (e.g. details about FPGA architectures) will be recapitulated in the scope of this lecture
  - Thus, the contents of ES1 and ES2 are beneficial but not required in full detail

### **General Literature**

- "Fine- and Coarse-Grain Reconfigurable Computing", S. Vassiliadis and D. Soudris, Springer 2007.
- "Runtime adaptive extensible embedded processors - a survey", H. P. Huynh and T. Mitra, SAMOS, pp. 215–225, 2009.
- "Reconfigurable computing: architectures and design methods", T.J. Todman et al., IEE Proceedings Computers & Digital Techniques, vol. 152, no. 2, pp. 193-207, 2005.
- "Reconfigurable Instruction Set Processors from a Hardware/Software Perspective", F. Barat et al., IEEE Transactions on Software Engineering, vol. 28, no. 9, pp. 847-862, 2002.




# Reconfigurable and Adaptive Systems (RAS)

1. Introduction and Motivation: The Demand for Adaptivity



# **Designing Embedded Systems**

- Typical approach:
  - Static analysis of system requirements (e.g. computational hot spots)
  - Build optimized system
- Today's requirements:
  - More functionality
  - Increasing complexity
  - Non-functional constraints
- Problem:
  - Statically chosen design point has to match all requirements
  - Typically inefficient for individual components (e.g. tasks or hot spots)



### Definition 'Computational Hot Spot'

- A rather small part of the application that corresponds to a rather large part of the execution time
  - Also called 'Computational Kernel'
  - Typically: inner loop
  - 80/20 rule (90/10 rule etc.)
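The definition above can be illustrated with a minimal Python sketch that picks the smallest set of code regions covering a given share of the execution time. The region names and cycle counts are made up for illustration, and `find_hot_spots` is a hypothetical helper, not part of any profiling tool.

```python
def find_hot_spots(profile, coverage=0.8):
    """Return the smallest set of code regions that together account for
    at least `coverage` of the total execution time (largest regions first)."""
    total = sum(profile.values())
    hot, covered = [], 0.0
    for name, cycles in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= coverage * total:
            break
        hot.append(name)
        covered += cycles
    return hot, covered / total


# Hypothetical cycle counts per code region (illustrative numbers only).
profile = {
    "motion_estimation_loop": 6200,
    "dct_loop": 2100,
    "bitstream_io": 900,
    "control_protocol": 500,
    "init_and_setup": 300,
}

hot_spots, share = find_hot_spots(profile)
print(hot_spots, round(share, 2))
```

With these assumed numbers, two of the five regions already cover more than 80% of the cycles, which is the kind of imbalance the 80/20 rule describes.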



### **Typical Implementation Alternatives**



#### First Example: H.324 Video Conferencing

- Video En-/Decoding
- Audio En-/Decoding
- Data (De-)Multiplexing
- Control protocol





#### Hot Spots in H.324 Video Conferencing



# ASIP Implementation

Figure legend: Base ISA Feature, Optional Function, Designer-Defined Features (TIE)

- Design accelerators for the hot spots
- Connect them as Execution Units, Register Files, and Interfaces



src: Tensilica, Inc.: "Xtensa LC Product Brief"

# ASIP Implementation (cont'd)

- Provides noticeably improved performance after targeting the major hot spots
- However, performance is still not sufficient to meet real-time requirements
  - More hot spots need to be accelerated
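Why accelerating only the major hot spots may not be enough can be sketched with Amdahl's law: the part of the runtime that is not accelerated bounds the overall speedup. The runtime fractions and local speedups below are assumed values for illustration, not measurements from the H.324 example.

```python
def overall_speedup(accelerated):
    """Amdahl's law for several hot spots.

    `accelerated` is a list of (fraction_of_runtime, local_speedup) pairs;
    the remaining runtime fraction runs unchanged on the base processor.
    """
    covered = sum(fraction for fraction, _ in accelerated)
    if not 0.0 <= covered <= 1.0:
        raise ValueError("runtime fractions must sum to at most 1")
    remaining = 1.0 - covered
    return 1.0 / (remaining + sum(f / s for f, s in accelerated))


# Assumed numbers: one hot spot taking 50% of the runtime, accelerated 10x.
one = overall_speedup([(0.5, 10.0)])
# Adding a second hot spot (30% of the runtime, also 10x).
two = overall_speedup([(0.5, 10.0), (0.3, 10.0)])
print(round(one, 2), round(two, 2))
```

In this sketch the first accelerator alone can never reach an overall speedup of 2, however fast it is, because half of the runtime is untouched. Only covering further hot spots raises that bound.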



src: Tensilica, Inc.: "Xtensa LC Product Brief"

# ASIP Implementation (cont'd)

- Scalability problem when many hot spots exist
  - Note: still not all relevant hot spots are covered



### Summary of ASIP Implementation

- ASIPs perform well when
  1. rather few hot spots need to be accelerated and
  2. those hot spots are well known in advance
- ASIPs are less efficient when targeting many hot spots or unknown hot spots
  - All accelerators are provided statically (i.e. they require area and consume power) even though typically just a few of them are needed at a certain time
  - Even for a given application it may not necessarily be clear which parts are 'hot' during execution, as this may depend on input data (as demonstrated in the following)
- In such situations a different architecture might be preferable
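The cost of static provisioning can be made concrete with a small sketch that computes which share of the accelerator area is actually in use during each execution phase. The accelerator names, area values, and phases are hypothetical and only serve to illustrate the argument.

```python
# Hypothetical accelerator areas in arbitrary units (assumed values).
accelerator_area = {"ME": 40, "DCT": 25, "Deblock": 20, "AudioFilter": 15}

# Hypothetical execution phases and the accelerators each one uses.
phases = {
    "video_encode": {"ME", "DCT"},
    "video_decode": {"DCT", "Deblock"},
    "audio": {"AudioFilter"},
}


def static_utilization(area, active):
    """Share of the statically provided accelerator area used by `active`."""
    total = sum(area.values())
    return sum(area[name] for name in active) / total


for phase, active in phases.items():
    print(phase, static_utilization(accelerator_area, active))
```

In every phase of this toy model a large part of the area sits idle while still being paid for in silicon, which is the inefficiency the summary points out.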

#### Second Example: H.264 Video Encoder



- Iterates on MacroBlocks (MBs, i.e. 16x16 pixels)
- 2 different MB-types
  - I-MB (spatial prediction)
  - P-MB (temporal prediction)
  - → different computational paths with different computational requirements
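The per-macroblock iteration can be sketched as a loop that dispatches on the MB type and counts how often each computational path runs. This is a toy model, not an encoder: the MB-type mixes of the two frames are invented, and `count_paths` is a hypothetical helper.

```python
from collections import Counter

MB_SIZE = 16  # a macroblock (MB) covers 16x16 pixels


def macroblocks_per_frame(width, height):
    """Number of MBs needed to cover a frame (edge MBs are rounded up)."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols * rows


def count_paths(mb_types):
    """Count how often each computational path runs for one frame."""
    paths = Counter()
    for mb_type in mb_types:
        if mb_type == "I":
            paths["spatial_prediction"] += 1
        elif mb_type == "P":
            paths["temporal_prediction"] += 1
        else:
            raise ValueError(f"unknown MB type: {mb_type!r}")
    return paths


n = macroblocks_per_frame(352, 288)  # CIF resolution
# Two frames with assumed (made-up) MB-type mixes.
frame_a = ["I"] * 10 + ["P"] * (n - 10)
frame_b = ["I"] * 300 + ["P"] * (n - 300)
print(n, count_paths(frame_a), count_paths(frame_b))
```

The same loop spends most of its iterations in a different path for each of the two frames, so which path counts as 'hot' depends on the input data.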


### **Example: Football Video**

Figure legend: I-MB, P-MB

<u>Note</u>: 16x16 MBs can be partitioned into sub-MBs, e.g. 16x8, 8x8, down to 4x4



# Example: Distribution of I-MBs in Medium-to-VeryHigh Motions



### **Conclusion: Demand for Adaptivity**

- Even for a well-known application it is not always clear which parts will be 'hot' (e.g. in terms of computational complexity) and thus benefit from accelerators
  - This depends on changing input data and control flow
- Even more complex: multi-tasking scenarios
  - Not clear, which applications will execute at the same time
  - Not clear, which applications will execute at all (user can download new applications)
  - This significantly increases the number of potential hot spots
  - → Hardly possible to address this with an ASIP
- Systems that fulfill the demand for adaptivity may lead to
  - Better performance (absolute criterion)
  - Higher efficiency (relative criterion, e.g. performance per area)
  - Lower cost (no redesign if specifications change, no overdesign to cover all scenarios)

# Potentials of RAS



# Potentials of RAS (cont'd)

- Providing accelerators for hot spots on demand
- Efficient dependability/reliability and fault tolerance
  - Rather than providing static redundancy or hardened devices, use online monitoring (BIST: Built-In Self-Test) to detect faults and use reconfiguration and adaptation to react accordingly
- Reducing the design/development costs
  - Hardware bug fixes, hardware updates
  - Avoids hardware redesign
- Shorter Time-to-market
  - The time between idea and product
- Improved efficiency
  - E.g. energy reduction due to better resource utilization
- So-called 'Self-x' properties (explained in the following)
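The first potential, providing accelerators on demand, can be sketched as a toy model of a reconfigurable fabric with a fixed number of slots. The class name and the least-recently-used replacement policy are assumptions chosen for illustration. The lecture does not prescribe this policy, and real systems also have to account for reconfiguration time.

```python
from collections import OrderedDict


class ReconfigurableFabric:
    """Toy model: a fixed number of slots, each holding one accelerator."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.loaded = OrderedDict()  # accelerator name -> None, oldest use first
        self.reconfigurations = 0

    def request(self, accelerator):
        """Make `accelerator` available; return True if it had to be loaded."""
        if accelerator in self.loaded:
            self.loaded.move_to_end(accelerator)  # mark as most recently used
            return False
        if len(self.loaded) >= self.num_slots:
            self.loaded.popitem(last=False)  # evict the least recently used one
        self.loaded[accelerator] = None
        self.reconfigurations += 1
        return True


fabric = ReconfigurableFabric(num_slots=2)
for name in ["ME", "DCT", "ME", "Deblock", "DCT"]:
    fabric.request(name)
print(list(fabric.loaded), fabric.reconfigurations)
```

Two slots serve three different accelerators over time in this sketch, at the price of reconfigurations whenever a requested accelerator is not currently loaded.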



#### Self-organisation/Self-configuration

- The ability to determine and establish feasible/good setups
  - Composed out of predetermined elements
  - Or created from scratch (online-synthesis)
  - Or implicitly created (emergent behavior)



#### Self-adaptation/Self-optimization

- The ability to modify/improve the system setup towards maximizing a certain cost function (e.g. performance, energy saving, or efficiency)
- The cost function is not necessarily fixed, but it may vary, depending on external requirements, goals etc.
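A minimal sketch of self-optimization under a varying cost function: the system picks, from a set of candidate setups, the one that maximizes a weighted sum of metrics, and the weights stand in for changing external requirements. The setups, metric values, and weights are all assumed for illustration.

```python
# Hypothetical system setups with assumed metric values (higher is better).
setups = {
    "software_only": {"performance": 1.0, "energy_saving": 0.9},
    "one_accelerator": {"performance": 2.5, "energy_saving": 0.6},
    "two_accelerators": {"performance": 3.5, "energy_saving": 0.2},
}


def best_setup(candidates, weights):
    """Return the setup name that maximizes the weighted sum of its metrics."""
    def score(metrics):
        return sum(weights[key] * metrics[key] for key in weights)
    return max(candidates, key=lambda name: score(candidates[name]))


# The weights model external goals that change over time.
print(best_setup(setups, {"performance": 1.0, "energy_saving": 0.0}))
print(best_setup(setups, {"performance": 0.3, "energy_saving": 1.0}))
print(best_setup(setups, {"performance": 0.1, "energy_saving": 1.0}))
```

Each of the three weightings selects a different setup here, so a system that re-evaluates its cost function at run time would reconfigure as its goals shift.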



src: M. C. Escher

# Self-healing

- The ability to resist, tolerate, or correct certain faults
- It is not necessarily required to explicitly detect them
- It is not necessarily required to operate with the same performance, efficiency etc. as before the fault
  - Graceful degradation



src: T-800; spill.com



src: T-1000; geekologie.com



src: T-1000; movie-infos.net

# **Sneak Preview**

#### Techniques for (Self-)Reconfiguration

- How to use/develop/reconfigure accelerators
- Optimizations (compile time/run time)
- Different flavors of reconfigurable processors
  - Basic systems
  - Highly efficient/adaptive systems
  - Online synthesis
- New Technologies for reconfigurable devices and innovative products
- Improving system reliability by reconfiguration



src: Mars Rover, newscientist.com



src: CERN, nytimes.com